PVM: Parallel View Maintenance under Concurrent Data Updates of Distributed Sources

Authors

  • Xin Zhang
  • Lingli Ding
  • Elke A. Rundensteiner
Abstract

Data warehouses (DW) are built by gathering information from distributed information sources (ISs) and integrating it into one customized repository. In recent years, work has begun to address the problem of view maintenance of DWs under concurrent data updates of different ISs. Popular solutions such as ECA and Strobe achieve such concurrent maintenance, but they require quiescence of the ISs. More recently, the SWEEP solution lifted this quiescence requirement by using a local compensation strategy, which however processes all update messages sequentially. To improve upon this sequential processing, we have developed a parallel view maintenance algorithm, called PVM, that incorporates all benefits of previous maintenance approaches while offering improved performance due to parallelism. To perform parallel view maintenance, we have identified two critical issues: (1) detecting maintenance-concurrent data updates in a parallel mode, and (2) correcting the problem that the DW commit order may not correspond to the DW update processing order due to parallel maintenance handling. In this work, we provide solutions to both issues. Given a modular component-based system architecture, we insert a middle-layer timestamp assignment module for detecting maintenance-concurrent data updates without requiring any global clock synchronization. In addition, we introduce the negative counter concept as a simple yet elegant solution to the problem of variant orders of committing effects of data updates to the DW. We have proven the correctness of PVM, guaranteeing that our strategy indeed generates the correct final DW state. We have implemented both SWEEP and PVM in our EVE data warehousing system. Our performance study demonstrates that PVM achieves a multi-fold performance improvement over SWEEP.
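
The abstract does not spell out how the negative counter works. As an illustration only, the Python sketch below shows one plausible reading of the commit-order issue: the view stores a signed count per tuple, and a count is allowed to drop below zero when a delete's effect is committed before the matching insert. Under that assumption the final view state does not depend on the commit order. All names here (`apply_delta`, `visible_rows`, `final_state`, the sample deltas) are hypothetical and are not taken from PVM or EVE.

```python
from collections import Counter
from itertools import permutations

def apply_delta(view, delta):
    """Commit one update's effect by adding signed tuple counts.

    Assumption: counts may go negative temporarily; a count of zero
    means the tuple is absent and its entry is dropped.
    """
    for row, count in delta.items():
        view[row] += count
        if view[row] == 0:
            del view[row]
    return view

def visible_rows(view):
    """Rows a reader would see: only tuples with a positive count."""
    return {row for row, count in view.items() if count > 0}

# Hypothetical effects of three source updates on the view.
deltas = [
    {("a", 1): +1},  # insert of tuple ("a", 1)
    {("a", 1): -1},  # later delete of the same tuple
    {("b", 2): +1},  # unrelated insert
]

def final_state(order):
    """Commit the deltas in the given order and return the final view."""
    view = Counter()
    for delta in order:
        apply_delta(view, delta)
    return dict(view)

# Every commit order leads to the same final view state.
results = {tuple(sorted(final_state(p).items())) for p in permutations(deltas)}
```

In this sketch, committing the delete before the insert leaves the count of `("a", 1)` at -1 for a while, and the later insert brings it back to zero. A design that rejected negative counts would instead have to buffer or reorder commits.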

Similar resources

Psweep: Parallel View Maintenance under Concurrent Data Updates of Distributed Sources

Data warehouses (DW) are built by gathering information from several information sources (ISs) and integrating it into one repository customized to users' needs. Recent work has begun to address the problem of view maintenance of DWs under concurrent data updates of different ISs. SWEEP, proposed by Agrawal et al. [AAS97], is one of the more popular solutions; even though its performance is limited...


Detection and Correction of Conflicting Source Updates for Materialized View Maintenance

Materialized views, often derived from several data sources, must be maintained under source changes. In a distributed context, autonomous source updates can be concurrent and thus cause erroneous maintenance results. State-of-the-art maintenance strategies issue maintenance queries to the sources and apply compensating queries to correct such errors. However, these solutions are limited to han...


Data Warehouse Maintenance under Concurrent Schema and Data Updates

Data warehouses (DW) are built by gathering information from several information sources and integrating it into one repository customized to users' needs. Recently proposed view maintenance algorithms tackle the problem of (concurrent) data updates happening at different autonomous ISs, whereas the EVE system addresses the maintenance of a data warehouse after schema changes of ISs. The concurr...


WPI-CS-TR-98-8, August 1998: Data Warehouse Maintenance Under Concurrent Schema

Data warehouses (DW) are built by gathering information from several information sources and integrating it into one repository customized to users' needs. Recently proposed view maintenance algorithms tackle the problem of (concurrent) data updates happening at different autonomous ISs, whereas the EVE system addresses the maintenance of a data warehouse after schema changes of ISs. The concurr...


An Architecture of a Data

We present incremental view maintenance algorithms for a data warehouse derived from multiple distributed autonomous data sources. We begin with a detailed framework for analyzing view maintenance algorithms for multiple data sources with concurrent updates. Earlier approaches for view maintenance in the presence of concurrent updates typically require two types of messages: one to compute the ...




Publication date: 2001